This research explores the use of machine learning techniques to improve short-term earthquake forecasting in support of disaster preparedness and risk reduction. A review of conventional and data-driven seismic prediction methods revealed notable limitations in current systems. To address them, the proposed framework, SeismoCastNet, combines five classification models, including Random Forest, Gradient Boosting, and Support Vector Machine (SVM). The models were trained and validated on historical earthquake data with attributes such as magnitude, depth, and geographical coordinates. Gradient Boosting delivered the most consistent performance, with an overall accuracy of 96%, a minor-class F1-score of 0.979, and a major-class F1-score of 0.545. SVM attained the highest minor-class recall (98.65%), but its performance on major seismic events was markedly lower (major-class recall of 28.12%). The findings indicate that ensemble learning strategies can help handle class imbalance and improve the predictive capability of earthquake detection systems.
1. Introduction
Earthquakes are unpredictable natural disasters that cause loss of life and major damage to infrastructure.
Forecasting earthquakes remains challenging despite advances in instrumentation and alert systems.
Forecasting is categorized into:
Long-term: Uses historical data, fault-line analysis, and GPS deformation tracking.
Short-term: Focuses on imminent predictions (days/weeks); uses foreshocks, gas emissions, and now machine learning (ML).
Key Goal: Improve short-term prediction accuracy using ML classification algorithms on seismic datasets.
2. Research Objective
Develop and evaluate SeismoCastNet, a predictive framework using ML.
Compare classifiers (SVM, Random Forest, Gradient Boosting) on their ability to identify minor vs major seismic events.
Evaluate performance using metrics like precision, recall, F1-score.
Aim to support early warning systems with high sensitivity to major earthquakes.
3. Related Works
Numerous studies explore ML in earthquake prediction:
Anitha et al.: Ensemble methods outperform individual classifiers.
Yavas et al.: Hybrid models improve urban seismic forecasting.
Ommi & Hashemi: Regional features improve model performance.
Rusho et al.: Attention-driven LSTM enhances long-term pattern recognition.
Katole et al.: Quantum SVM for enhanced precision.
Tiwari et al.: ML used in predicting magnitudes in the Himalayas.
ML methods range from classical classifiers to deep learning, ensemble models, quantum computing, and hybrid techniques.
4. Methodology
Data & Sample
Source: USGS Earthquake Catalog (via Kaggle)
Total records: 3.4 million+ earthquakes
Sample: 1,000 stratified records, maintaining the imbalance between minor and major events.
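The stratified draw described above can be sketched in plain Python. This is a minimal illustration, not the authors' sampling code: the function name `stratified_sample`, the synthetic `catalog`, and the fixed seed are assumptions made for the example. The 3.2% share of major events is taken from the confusion matrices in Section 6, where TP + FN = 32 out of 1,000 records.

```python
import random
from collections import defaultdict

def stratified_sample(records, label_of, n, seed=0):
    """Draw about n records while keeping each class's share of the full data."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for rec in records:
        by_class[label_of(rec)].append(rec)

    total = len(records)
    sample = []
    for group in by_class.values():
        # Proportional allocation; keep at least one record per class.
        k = max(1, round(n * len(group) / total))
        sample.extend(rng.sample(group, min(k, len(group))))
    rng.shuffle(sample)
    return sample

# Synthetic stand-in for the catalog (hypothetical): 3.2% "major", the rest "minor".
catalog = [{"id": i, "label": "major" if i % 125 < 4 else "minor"}
           for i in range(100_000)]

sample = stratified_sample(catalog, lambda r: r["label"], n=1000)
major_share = sum(r["label"] == "major" for r in sample) / len(sample)
print(len(sample), round(major_share, 3))  # 1000 0.032
```

Proportional allocation preserves the imbalance instead of correcting it, so the evaluation reflects how rare major events are in the catalog.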
Data Preprocessing
PCA reduced 21 features to 12 components.
Features: Magnitude, depth, coordinates, time, etc.
Data was cleaned, normalized, and labeled for binary classification.
Modeling Framework
Models used:
Random Forest (ensemble decision trees)
Gradient Boosting (iterative tree correction)
SVM (maximum-margin classifier; kernels allow non-linear decision boundaries)
Target: Classify minor vs major earthquakes.
Metrics focused on F1-score to handle imbalanced data.
Threshold tuning applied to improve major event detection.
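The source does not say how the threshold was tuned. One common approach, sketched here as an assumption, is to lower the decision threshold on the predicted probability of the major class until a target recall is met. The function names, the toy labels, and the toy scores below are hypothetical.

```python
def recall_precision(y_true, scores, threshold):
    """Recall and precision for the positive (major) class at a threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision

def tune_threshold(y_true, scores, min_recall):
    """Return the highest threshold whose recall still meets min_recall.

    Scanning from high to low means the first hit flags the fewest events,
    and therefore produces the fewest false positives, for that recall.
    """
    for t in sorted(set(scores), reverse=True):
        recall, precision = recall_precision(y_true, scores, t)
        if recall >= min_recall:
            return t, recall, precision
    return None

# Toy data (hypothetical): 1 = major, 0 = minor; scores are P(major).
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.40, 0.60, 0.55, 0.90]

print(recall_precision(y_true, scores, 0.5))          # default cut-off misses one major event
print(tune_threshold(y_true, scores, min_recall=1.0))  # lower cut-off catches all three
```

In this toy example the tuned threshold raises recall at the cost of precision. The Random Forest and Gradient Boosting results show the same pattern, with relatively high major recall alongside many false positives.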
5. Evaluation Metrics
Accuracy: (TP + TN) / Total
Precision: TP / (TP + FP)
Recall (Sensitivity): TP / (TP + FN)
F1-Score: 2 × Precision × Recall / (Precision + Recall), the harmonic mean of precision and recall
Priority was given to recall of major earthquakes, due to their critical impact.
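As a worked check of these definitions, the sketch below computes the four metrics from raw confusion-matrix counts. The function name `classification_metrics` is illustrative. The counts are the Gradient Boosting values reported in Section 6, with the major class as the positive class. The major-class precision is not reported in the source and is simply derived from those counts.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 for the positive class."""
    total = tp + tn + fp + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Gradient Boosting counts from Section 6 (major = positive class).
gb = classification_metrics(tp=24, tn=936, fp=32, fn=8)
print({name: round(value, 3) for name, value in gb.items()})
# {'accuracy': 0.96, 'precision': 0.429, 'recall': 0.75, 'f1': 0.545}
```

The output reproduces the reported accuracy (96.0%), major recall (75.00%), and major F1 (0.545). Swapping the roles of the two classes (tp=936, tn=24, fp=8, fn=32) yields the minor-class values.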
6. Results & Analysis
Performance of Models
| Model | Accuracy | Minor F1 | Major F1 | Minor Recall | Major Recall |
|---|---|---|---|---|---|
| Random Forest | 94.5% | 0.971 | 0.455 | 95.24% | 71.88% |
| Gradient Boosting | 96.0% | 0.979 | 0.545 | 96.69% | 75.00% |
| SVM | 96.4% | 0.981 | 0.333 | 98.65% | 28.12% |
SVM had the highest overall accuracy but the worst recall for major events.
Gradient Boosting achieved the best balance, with strong performance on both minor and major events.
Decision Tree (not shown in table) had the highest recall for major class (81.25%) but produced many false positives.
Confusion Matrix Highlights
| Model | TP (Major) | TN | FP | FN |
|---|---|---|---|---|
| Random Forest | 23 | 922 | 46 | 9 |
| Gradient Boosting | 24 | 936 | 32 | 8 |
| SVM | 9 | 955 | 13 | 23 |
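Gradient Boosting had the best trade-off: the most true positives (24), the fewest false negatives (8), and a moderate number of false positives (32), which makes it the most suitable candidate for real-time deployment.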
7. Final Model Selection
Gradient Boosting was selected as the optimal model due to:
Balanced performance (especially on major events)
Highest major-class recall (75.00%) and F1-score (0.545) among the three tabulated models, with a minor-class F1-score of 0.979
Scalable, with moderate computational cost
Deployed via a Flask web application for real-time prediction
8. Key Observations
Prediction performance for minor earthquakes is consistently high across all models.
Major earthquake prediction requires more sensitive and balanced models (e.g., Gradient Boosting).
Accuracy alone is misleading in imbalanced datasets—recall and F1-score are more informative.
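PCA reduces the feature space (21 features to 12 components), which lowers training cost, and stratified sampling keeps the minor/major class ratio of the sample representative of the full catalog.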
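The point about accuracy can be made concrete with a trivial baseline. The sketch below scores a hypothetical classifier that labels every event as minor, using the class counts implied by the confusion matrices in Section 6 (TP + FN = 32 major events, TN + FP = 968 minor events). The function name is illustrative.

```python
def always_minor_baseline(n_minor, n_major):
    """Metrics for a classifier that labels every event as minor."""
    total = n_minor + n_major
    accuracy = n_minor / total  # every minor event counts as correct
    major_recall = 0.0          # no major event is ever flagged
    return accuracy, major_recall

# Class counts implied by the confusion matrices in Section 6.
accuracy, major_recall = always_minor_baseline(n_minor=968, n_major=32)
print(f"accuracy={accuracy:.3f}, major recall={major_recall:.2f}")
# accuracy=0.968, major recall=0.00
```

This baseline reaches 96.8% accuracy, slightly above the 96.4% reported for SVM, while detecting no major earthquakes at all. This is why the comparison rests on major-class recall and F1-score rather than on accuracy.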
9. Conclusion
This research conducted a comparative analysis of various machine learning algorithms for classifying earthquake events using a comprehensive seismic dataset. Models including Random Forest, Gradient Boosting, and Support Vector Machine (SVM) were evaluated based on key performance indicators such as accuracy, F1-score, and confusion matrix statistics. Among these, Gradient Boosting emerged as the most effective model, offering a balanced classification performance for both minor and major seismic events. The results underscore the potential of machine learning as a powerful tool in seismic prediction, particularly when models are carefully optimized and assessed on class-specific metrics. Although overall accuracy was high for all models, the variation in recall scores for the major class revealed a critical area for improvement. This highlights the importance of focusing not just on aggregate accuracy but on the ability to correctly identify high-impact, rare seismic events.
Future research will aim to enhance the precision and reliability of predictions for major earthquakes, explore methods to reduce false negatives, and improve the generalizability of models across diverse geographical and geological conditions. These advancements will contribute to the development of more robust, real-time early warning systems for disaster preparedness and mitigation.
References
[1] M. Anitha, K. Hareesh, and P. Bhuvaneswari, "Prediction of major earthquake events using different machine learning algorithms," in IJARST, doi: 10.48047/ijarst/V14/05/74.
[2] C. Emre Yavas, L. Chen, C. Kadlec and Y. Ji, "Predictive Modeling of Earthquakes in Los Angeles With Machine Learning and Neural Networks," in IEEE Access, vol. 12, pp. 108673-108702, 2024, doi: 10.1109/ACCESS.2024.3438556.
[3] Salma Ommi and Mohammad Hashemi, "Machine learning technique in the North Zagros earthquake prediction," Applied Computing and Geosciences, vol. 22, 2024, 100163.
[4] Gentili, Stefania, Giuseppe Davide Chiappetta, Giuseppe Petrillo, Piero Brondi, and Jiancang Zhuang. "Forecasting Strong Subsequent Earthquakes in Japan using an improved version of NESTORE Machine Learning Algorithm." arXiv e-prints (2024): arXiv-2408.
[5] Maher Ali Rusho, Reyhan Azizova, Dmytro Mykhalevskiy, Maksym Karyonov, & Heyran Hasanova. (2024). Advanced Earthquake prediction: unifying networks,algorithms, and attention-driven LSTM modeling. GEOMATE Journal, 27(119), 135–142.
[6] Ramírez Eudave, R., Ferreira, T. M., Vicente, R., Lourenco, P. B., & Peña, F. (2023). Parametric and Machine Learning-Based Analysis of the Seismic Vulnerability of Adobe Historical Buildings Damaged After the September 2017 Mexico Earthquakes. International Journal of Architectural Heritage, 18(6), 940–963.
[7] A. Katole, V. Batheja, A. Deshmukh, F. Dekate and A. Soni, "Earthquake Prediction Using QSVM," 2024 IEEE International Students' Conference on Electrical, Electronics and Computer Science (SCEECS), Bhopal, India, 2024, pp. 1-7, doi: 10.1109/SCEECS61402.2024.10482242.
[8] M. S. Abdalzaher, M. S. Soliman and S. M. El-Hady, "Seismic Intensity Estimation for Earthquake Early Warning Using Optimized Machine Learning Model," in IEEE Transactions on Geoscience and Remote Sensing, vol. 61, pp. 1-11, 2023, Art no. 5914211, doi: 10.1109/TGRS.2023.3296520.
[9] Abdul Salam, Mustafa & Abdelminaam, & Salama, Dr-Diaa. (2021). Earthquake Prediction using Hybrid Machine Learning Techniques. International Journal of Advanced Computer Science and Applications. 12. 654-665. 10.14569/IJACSA.2021.0120578.
[10] Kafadar, Ö., Tunç, S. & Tunç, B. ESenTRy: an on-site earthquake early warning system based on the instrumental modified Mercalli intensity. Earth Sci Inform 17, 5027–5041 (2024). https://doi.org/10.1007/s12145-024-01407-2
[11] Y. Zhao, S. Lv and P. Liu, "Advances in Earthquake Prevention and Reduction Based on Machine Learning: A Scoping Review," in IEEE Access, vol. 12, pp. 143908-143929, 2024.
[12] Debnath, P.; Chittora, P.; Chakrabarti, T.; Chakrabarti, P.; Leonowicz, Z.; Jasinski, M.; Gono, R.; Jasińska, E. Analysis of Earthquake Forecasting in India Using Supervised Machine Learning Classifiers. Sustainability 2021, 13, 971.
[13] K C, Sajan & Bhusal, Anish & Gautam, Dipendra & Rupakhety, Rajesh. (2022). Earthquake damage and rehabilitation intervention prediction using machine learning.
[14] Asim, K.M., Martínez-Álvarez, F., Basit, A. et al. Earthquake magnitude prediction in Hindukush region using machine learning techniques. Natural Hazards 85, 471–486 (2017).
[15] A. A. Mir, F. V. Çelebi, H. Alsolai, S. A. Qureshi, and M. Rafique, "Anomalies Prediction in Radon Time Series for Earthquake Likelihood Using Machine Learning-Based Ensemble Model," in IEEE Access, vol. 10, pp. 37984-37999, 2022, doi: 10.1109/ACCESS.2022.3163291.
[16] Mallouhy, Roxane, et al. "Major earthquake event prediction using various machine learning algorithms." 2019 International Conference on Information and Communication Technologies for Disaster Management (ICT-DM). IEEE, 2019.
[17] Tiwari, Ram Krishna, Rudra Prasad Poudel, and Harihar Paudyal. "Machine learning for predicting earthquake magnitudes in the Central Himalaya." BIBECHANA 22.1 (2025): 22-29.